46 research outputs found

    many faces, many places (Term21)

    UIDB/03213/2020 UIDP/03213/2020

    Leveraging a Narrative Ontology to Query a Literary Text

    In this work we propose a model for representing the narrative of a literary text. The model is structured as an ontology and a lexicon, which together constitute a knowledge base that can be queried by a system. This narrative ontology describes the actors, locations, and situations found in the text, and also provides an explicit formal representation of the timeline of the story. We focus on a specific case study: the representation of a selected portion of Homer's Odyssey, and in particular of the knowledge required to answer a selection of salient queries formulated by a literary scholar. This work is being carried out within the framework of the Semantic Web by adopting models and standards such as RDF, OWL, SPARQL, and lemon, among others.

    Modelling frequency and attestations for OntoLex-Lemon

    The OntoLex vocabulary enjoys increasing popularity as a means of publishing lexical resources with RDF and as Linked Data. The recent publication of a new OntoLex module for lexicography, lexicog, reflects its increasing importance for digital lexicography. However, not all aspects of digital lexicography have been covered to the same extent. In particular, supplementary information drawn from corpora, such as frequency information, links to attestations, and collocation data, was considered to be beyond the scope of lexicog. Therefore, the OntoLex community has put forward a proposal for a novel module for frequency, attestation and corpus information (FrAC) that not only covers the requirements of digital lexicography, but also accommodates essential data structures for lexical information in natural language processing. This paper introduces the current state of the OntoLex-FrAC vocabulary and describes its structure, some selected use cases, elementary concepts and fundamental definitions, with a focus on frequency and attestations.

    OntoLex-Morph: Morphology for the Web of Data

    Purpose: OntoLex-Lemon is a widely used community standard for publishing lexical resources in machine-readable form, and is in fact the predominant RDF vocabulary for this purpose. With the growing popularity and increasing adoption of this model for applications in both language technology and lexicography, a number of new modules have been developed in the past year to complement the OntoLex core vocabulary and its lexicographic follow-up, lexicog. In this paper, we describe the current status of the development of the OntoLex-Morph vocabulary.

    Historiae, History of Socio-Cultural Transformation as Linguistic Data Science. A Humanities Use Case

    The paper proposes an interdisciplinary approach, drawing on methods from disciplines such as the history of concepts, linguistics, natural language processing (NLP) and the Semantic Web, to create a comparative framework for detecting semantic change in multilingual historical corpora and generating diachronic ontologies as linguistic linked open data (LLOD). Initiated as a use case (UC4.2.1) within the COST Action Nexus Linguarum, European network for Web-centred linguistic data science, the study will explore emerging trends in knowledge extraction, analysis and representation from linguistic data science, and apply the devised methodology to datasets in the humanities to trace the evolution of concepts from the domain of socio-cultural transformation. The paper will describe the main elements of the methodological framework and the preliminary planning of the intended workflow.

    Tracing Semantic Change with Multilingual LLOD and Diachronic Word Embeddings

    Purpose: The project will combine word embedding techniques and linguistic linked open data (LLOD) with theoretical aspects from lexical semantics, the history of concepts, and knowledge organization to trace the evolution of concepts in a collection of multilingual diachronic corpora of seven extinct and extant languages (Latin, Ancient Greek, Hebrew, French, Old Lithuanian, Romanian, German). The outcome will consist of a sample of diachronic ontologies to be published on the LLOD cloud. It will also comprise reflections on the potential interconnections across different languages that can be built through these knowledge structures.

    Interlinking Lexicographic Data in the MORDigital Project

    Purpose: To introduce MORDigital as an innovative Portuguese national project that incorporates the latest results in computational lexicography, the digital humanities, and linguistic linked data. In particular, we will show how it brings together work on the development of TEI Lex-0 and OntoLex-Lemon, as well as recent innovations in the conversion of retrodigitized dictionaries into computational lexical resources (using in this case the GROBID-dictionaries tool).

    Workflow Reversal and Data Wrangling in Multilingual Diachronic Analysis and Linguistic Linked Open Data Modelling

    Peer reviewed. The article deals with data wrangling in a multilingual collection intended for diachronic analysis and linguistic linked open data modelling for tracing concept change over time. Two types of static word embeddings are used: word2vec (French and Hebrew data sets), and fastText (Latin and Lithuanian data sets). We model examples from these embeddings via the OntoLex-FrAC formalism. To address the challenge of heterogeneity, we use a minimalist workflow design allowing for both convergence and flexibility in attaining the project goals. CA18209 - European network for Web-centred linguistic data science (NexusLinguarum)

    Towards a Conversational Web? A Benchmark for Analysing Semantic Change with Conversational Bots and Linked Open Data

    Peer reviewed. The paper presents preliminary results from our experiments with large language models, linked data, and semantic change in multilingual diachronic contexts. It proposes the first steps towards a benchmark and aims at fostering discussion on the concept of conversational knowledge bots as emerging paradigms, and the use of linked open data in linguistic tasks. CA18209 - European network for Web-centred linguistic data science (NexusLinguarum)